Abstract


This report describes an initial replication
study of the PRECISE system and develops
a clearer, more formal description of the
approach. Based on our evaluation, we
conclude that the PRECISE results do not
fully replicate. However, the formalization
developed here suggests a road map to
further enhance and extend the approach
pioneered by PRECISE.

After a long, productive discussion with
Ana-Maria Popescu (one of the authors
of PRECISE) we got more clarity on the
PRECISE approach and how the lexicon
was authored for the GEO evaluation.
Based on this we built a more direct
implementation over a repaired formalism.
Although our new evaluation is not yet
complete, it is clear that the system is
performing much better now. We will
continue developing our ideas and
implementation and generate a future
report/publication that more accurately
evaluates PRECISE-like approaches.

1 Introduction 


It is no secret that the cost of configuring
and maintaining natural language interfaces to
databases is one of the main obstacles to their
wider adoption (Androutsopoulos et al., 2000).
While recent work has focused on learning
approaches, there are less costly alternatives
based on only lightly naming database elements
(e.g. relations, attributes, values) and
reducing question interpretation to graph
match (Chu and Meng, 1999; Popescu, Etzioni
and Kautz, 2003).


What is particularly compelling about
PRECISE (Popescu, Etzioni and Kautz, 2003;
Popescu et al., 2004) is the claim that for a
large and well defined class of semantically
tractable questions, one can guarantee correct
translation to SQL. Furthermore, PRECISE
leverages off-the-shelf open domain syntactic
parsers to help guide query interpretation,
thus requiring no tedious grammar
configuration. Unfortunately, after PRECISE
was introduced there has not been much, if
any, follow up. This paper aims to evaluate
these claims by implementing the model and
conducting experiments equivalent to those
done by the designers of PRECISE.


Movie(title, year, studio, director, lead)
Shows(movie, theater)
Studio(name, country, premier)
Theater(name, city)

[movie] Movie
[film] Movie
[title] Movie.title
[name] Movie.title
[year] Movie.year
[director] Movie.director
[lead] Movie.lead
[shows] Shows
[theater] Theater
[name] Theater.name
[city] Theater.city
[studio] Studio
[name] Studio.name
[country] Studio.country

[Dirty Harry] Movie.title="Dirty Harry"
[Unforgiven] Movie.title="Unforgiven"
[1971] Movie.year=1971
[1992] Movie.year=1992
[Clint Eastwood] Movie.director="Clint Eastwood"
[Don Siegel] Movie.director="Don Siegel"
[Clint Eastwood] Movie.lead="Clint Eastwood"
[the Westwood] Theater.name="Westwood"
[Los Angeles] Theater.city="LA"

Figure 1: Example schema and partial lexicon

Consider the example database schema depicted
at the top of Figure 1. Although this schema
is small, it contains a many-to-many
relationship (movies to theaters) and a
many-to-one relationship from movie to studio.
The schema is also cyclic (via foreign
key-based joins) because of the somewhat
contrived foreign key premier from Studio to
Theater, which indicates that a studio shows
its premieres in a specific theater.

2 A more ‘precise’ formalization 
2.1 The Database 

A database is represented as disjoint sets of
relations $R$, attributes $A$ and values $V$,
which together are the database elements
$E = R \cup A \cup V$. The functions
$\mathit{relOf} : A \to R$ and
$\mathit{attOf} : V \to A$ give the relation of
an attribute and the attribute of a value
respectively. The Boolean function
$\mathit{key} : A \to \{true, false\}$ is true
for attributes that are primary keys of their
corresponding relations.
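As an illustration only, the three kinds of database elements and the functions relOf, attOf and key can be sketched in Python as follows. The class and function names are hypothetical, and treating Movie.title as a primary key is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    name: str


@dataclass(frozen=True)
class Attribute:
    relation: Relation
    name: str
    is_key: bool = False


@dataclass(frozen=True)
class Value:
    attribute: Attribute
    literal: object


def rel_of(a: Attribute) -> Relation:
    """relOf: the relation an attribute belongs to."""
    return a.relation


def att_of(v: Value) -> Attribute:
    """attOf: the attribute a value belongs to."""
    return v.attribute


def key(a: Attribute) -> bool:
    """key: true for primary-key attributes."""
    return a.is_key


# A fragment of the example schema.
movie = Relation("Movie")
title = Attribute(movie, "title", is_key=True)  # assumption: title is the key
year = Attribute(movie, "year")
unforgiven = Value(title, "Unforgiven")

R, A, V = {movie}, {title, year}, {unforgiven}
E = R | A | V  # all database elements
```

Keeping the three kinds as separate frozen classes makes the sets disjoint by construction and lets elements be used as set members.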

2.2 Words, phrases and the lexicon 

We consider $W$ to be the set of words in a
natural language and the set of phrases
$\mathcal{P}$ to be all finite non-empty word
sequences. We speak of $w_i$ being the $i$-th
word of the phrase $p = [w_1, ..., w_n]$,
$p[i] = w_i$, and $|p|$ is the length of $p$.
$\mathit{WH}$ = {[who], [which], [what],
[where], [when], [how]} $\subset \mathcal{P}$
and $\mathit{Stop}$ = {[are], [the], [on], [a],
[in], [is], [be], [of], [do], [with], [have],
[has]} $\subset \mathcal{P}$. Assume a special
function $\mathit{stem} : W \to W$ which stems
words according to the morphology of the
natural language. The lexicon
$\mathcal{L} \subseteq \mathcal{P} \times E$ is
a set of phrases paired with database elements.
See the bottom part of Figure 1 for an example
lexicon. Finally, assume the function
$\mathit{compWH} : A \cup R \to 2^{\mathit{WH}}$
which associates with every attribute and
relation a set of compatible WH-words (e.g.
$\mathit{compWH}(\mathit{Movie.name})$ =
{[which], [what]}).
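A minimal sketch of these structures in Python is given below. Phrases are tuples of words and database elements are plain strings. The stemmer is a toy stand-in (an assumption, not a real morphological analyser), and the helper name `elements_for` is hypothetical.

```python
def stem(word: str) -> str:
    """Toy stemmer: lower-case and strip a plural 's' (assumption)."""
    w = word.lower()
    return w[:-1] if w.endswith("s") and len(w) > 3 else w


# WH-words and stop words as one-word phrases.
WH = {("who",), ("which",), ("what",), ("where",), ("when",), ("how",)}
STOP = {("are",), ("the",), ("on",), ("a",), ("in",), ("is",), ("be",),
        ("of",), ("do",), ("with",), ("have",), ("has",)}

# Part of the example lexicon: (phrase, database element) pairs.
LEXICON = {
    (("movie",), "Movie"),
    (("film",), "Movie"),
    (("title",), "Movie.title"),
    (("name",), "Movie.title"),
    (("name",), "Theater.name"),
    (("the", "westwood"), 'Theater.name="Westwood"'),
}

# Compatible WH-words per attribute or relation (illustrative entry only).
COMP_WH = {"Movie.title": {("which",), ("what",)}}


def elements_for(phrase):
    """All database elements the lexicon pairs with `phrase`, up to stemming."""
    wanted = tuple(stem(w) for w in phrase)
    return {e for p, e in LEXICON if tuple(stem(w) for w in p) == wanted}
```

For instance, `elements_for(("name",))` yields both Movie.title and Theater.name, which is the kind of lexical ambiguity the later mapping step has to resolve.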

2.3 Assigning words to phrases 

A user question $q$ is a sequence of words
$q = [w_1, ..., w_n]$. An off-the-shelf
syntactic parser determines an attachment
relation between words. Formally,
$AW_q(i,j) \iff \{i,j\} \subseteq \{1,..,n\} \land w_i$
attaches to $w_j$.

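A covering assignment
$\zeta : \{1,..,n\} \to \mathcal{P} \cup \mathit{Stop} \cup \mathit{WH}$
observes the following properties:

1. (words belong to phrases)

if $\zeta(i) = p_j$ then
$(\exists e)((p_j, e) \in \mathcal{L}) \lor p_j \in \mathit{Stop} \cup \mathit{WH}$

2. (phrases are complete)

if $\zeta(i) = p_j$ and
$(i = 1 \lor (\zeta(i-1) = p_k \land k \neq j))$,
then
$(\forall m)((m \in \mathbb{N} \land m > 0 \land m < |p_j|) \Rightarrow \mathit{stem}(q[i+m]) = \mathit{stem}(p_j[m]))$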

The set of lexicon phrases in the range of
$\zeta$ is $P_\zeta$. This corresponds to what
the authors of PRECISE call tokenization.
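The search for covering assignments can be sketched as a brute-force enumeration of every way to split the question into consecutive known phrases, compared after stemming. This is a minimal Python sketch under stated assumptions: the stemmer is a toy, the function name `coverings` is hypothetical, and the set of known phrases is taken to be the lexicon phrases together with the stop words and WH-words.

```python
def stem(word):
    """Toy stemmer (an assumption, not a real morphological analyser)."""
    w = word.lower()
    if w.endswith("ing") and len(w) > 5:
        return w[:-3]
    if w.endswith("s") and len(w) > 3:
        return w[:-1]
    return w


def coverings(question, phrases):
    """Enumerate all splits of `question` (a list of words) into
    consecutive phrases drawn from `phrases` (tuples of words)."""
    stems = [stem(w) for w in question]
    known = {tuple(stem(w) for w in p) for p in phrases}
    max_len = max(len(p) for p in known)
    results = []

    def extend(start, acc):
        if start == len(stems):          # every word is covered
            results.append(list(acc))
            return
        last = min(len(stems), start + max_len)
        for end in range(start + 1, last + 1):
            candidate = tuple(stems[start:end])
            if candidate in known:       # a complete phrase starts here
                acc.append(candidate)
                extend(end, acc)
                acc.pop()

    extend(0, [])
    return results


known = [("what",), ("is",), ("the",), ("of",), ("with",),
         ("year",), ("film",), ("name",), ("unforgiven",)]
question = "What is the year of the film with name unforgiven".split()
tokenizations = coverings(question, known)
```

A question containing a word that belongs to no known phrase yields an empty result, which corresponds to rejecting the question at this stage.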

2.4 Mapping to database elements 

Consider $\phi_\zeta : P_\zeta \to E$ to be an
injective function with image $E_{\phi_\zeta}$.
This corresponds to the matching process in the
PRECISE papers where each phrase is paired
uniquely with a database element.

We define a binary attachment relation
$AE_{\phi_\zeta}$ on the elements in
$E_{\phi_\zeta}$ which carries the attachment
information on words over to attachment
relations on elements. Formally,
$(\forall e_i)(\forall e_j)(AE_{\phi_\zeta}(e_i, e_j) \iff \{e_i, e_j\} \subseteq E_{\phi_\zeta} \land (\exists i')(\exists j')(\phi_\zeta(\zeta(i')) = e_i \land \phi_\zeta(\zeta(j')) = e_j \land AW_q(i', j')))$

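A mapping that satisfies the following
additional constraints is valid:

1. (unique focus)

$(\exists! e_{focus})(e_{focus} \in E_{\phi_\zeta} \cap (A \cup R))$

2. (necessary value correspondences)

$(\forall e)(e \in E_{\phi_\zeta} \land e \in V \Rightarrow (\mathit{attOf}(e) \in E_{\phi_\zeta} \land AE_{\phi_\zeta}(e, \mathit{attOf}(e))) \lor (\mathit{relOf}(\mathit{attOf}(e)) \in E_{\phi_\zeta} \land AE_{\phi_\zeta}(e, \mathit{relOf}(\mathit{attOf}(e)))) \lor \mathit{key}(\mathit{attOf}(e)) = true)$

3. (necessary attribute correspondences)

$(\forall e)(e \in E_{\phi_\zeta} \land e \in A \land e \neq e_{focus} \Rightarrow ((\exists! e')(e' \in E_{\phi_\zeta} \land e' \in V \land \mathit{attOf}(e') = e \land AE_{\phi_\zeta}(e, e'))))$

4. (necessary relation correspondences)

$(\forall e)(e \in E_{\phi_\zeta} \land e \in R \Rightarrow (\exists e')(e' \in E_{\phi_\zeta} \land ((e' \in A \land \mathit{relOf}(e') = e) \lor (e' \in V \land \mathit{relOf}(\mathit{attOf}(e')) = e))))$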

Property 1 states that there is a distinguished
attribute or relation that is the focus of the
question. Property 2 states that values must be
paired with either an attribute (e.g. “... title
unforgiven ...”), or via ellipsis paired with a
relation (e.g. “... the movie unforgiven”), or,
if the value is a key itself, we have a highly
elliptical case where the value may stand on
its own (e.g. “unforgiven”). Property 3 says
that non-focus attributes must pair with a
value (e.g. in “...movies of year 2000...” 2000
serves this role). Property 4 was included in
the PRECISE papers, but we found it unnecessary.
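Properties 1 to 3 can be checked mechanically. The following Python sketch is one possible reading, not the actual implementation: elements are encoded as tagged tuples, the focus is passed in explicitly rather than searched for, and attachment between elements is treated as symmetric. All of these are assumptions made for the illustration, as is the function name `is_valid`.

```python
def att_of(v):
    """Attribute of a value ("val", rel, att, literal)."""
    return ("att", v[1], v[2])


def rel_of(a):
    """Relation of an attribute ("att", rel, att)."""
    return ("rel", a[1])


def is_valid(elements, focus, attached, keys):
    """Check properties 1-3 for a set of mapped elements.

    attached: set of frozenset pairs of elements that attach to each other
    keys:     set of attributes that are primary keys
    """
    def linked(a, b):
        return frozenset((a, b)) in attached

    # Property 1: the focus is a mapped attribute or relation.
    if focus not in elements or focus[0] not in ("att", "rel"):
        return False
    for e in elements:
        if e[0] == "val":
            # Property 2: a value pairs with its attribute, or with its
            # relation, or belongs to a key attribute.
            a, r = att_of(e), rel_of(att_of(e))
            if not ((a in elements and linked(e, a))
                    or (r in elements and linked(e, r))
                    or a in keys):
                return False
        elif e[0] == "att" and e != focus:
            # Property 3: a non-focus attribute pairs with exactly one value.
            partners = [v for v in elements
                        if v[0] == "val" and att_of(v) == e and linked(e, v)]
            if len(partners) != 1:
                return False
    return True


MOVIE = ("rel", "Movie")
YEAR = ("att", "Movie", "year")
TITLE = ("att", "Movie", "title")
UNFORGIVEN = ("val", "Movie", "title", "Unforgiven")

# "What is the year of the film with name unforgiven?"
ok = is_valid({YEAR, MOVIE, TITLE, UNFORGIVEN}, YEAR,
              {frozenset((TITLE, UNFORGIVEN))}, keys={TITLE})
```

Property 4 is left out of the check, in line with the observation that it is unnecessary.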

2.5 Semantically tractable questions 

Definition 1 (Semantically Tractable Question)
For a given question $q$, lexicon $\mathcal{L}$
and attachment relation $AW_q$, $q$ is
semantically tractable if there exists a
covering assignment $\zeta$ over $q$ for which
there is a valid mapping $\phi_\zeta$, and
$\zeta$ assigns a word in $q$ to $\mathit{WH}$
which is compatible with
$e_{focus} \in E_{\phi_\zeta}$.






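Definition 2 (Unambiguous Semantically
Tractable Question) For a given question $q$,
lexicon $\mathcal{L}$ and attachment relation
$AW_q$, $q$ is unambiguous semantically
tractable if $q$ is semantically tractable and
$(\forall \zeta')(\forall \zeta'')(\phi_{\zeta'}$
is valid $\land$ $\phi_{\zeta''}$ is valid
$\Rightarrow E_{\phi_{\zeta'}} = E_{\phi_{\zeta''}})$.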

Figure 2 shows three valid mappings given the
schema and lexicon in Figure 1. An additional
example is “what films did Don Siegel direct
with lead Clint Eastwood?” This is an
unambiguous semantically tractable question so
long as ‘Don Siegel’ attaches to ‘direct’ and
not ‘lead’, and ‘Clint Eastwood’ attaches to
‘lead’ and not ‘direct’.


"What is the year of the film with name unforgiven?"
  phrases:  WH, [year], [film], [name], [Unforgiven]
  elements: Movie.year, Movie, Movie.title, Movie.title="Unforgiven"

"The movie unforgiven is what year?"
  phrases:  [movie], [Unforgiven], WH, [year]
  elements: Movie, Movie.title="Unforgiven", Movie.year

"what films are showing at the Westwood"
  phrases:  WH, [movie], [shows], [the Westwood]
  elements: Movie, Shows, Theater.name="The Westwood"

Figure 2: Example valid mappings


2.6 Generating SQL 

The PRECISE papers say little about generating
SQL from sets of database elements. That said,
it seems fairly straightforward. The focus
element becomes the attribute (or * in case the
focus is a relation) in the SQL SELECT clause.
All the involved or implied relation elements
are included in the FROM clause. The value
elements determine the simple equality
conditions in the WHERE clause. Adding the join
conditions is not formalized in PRECISE, but we
assume it means adding the minimal set of
equality joins necessary to span all relation
elements. For cyclic schemas this can lead to
ambiguity. For example, while there is a unique
valid mapping for the question “What movies at
the Westwood”, join paths via studio or shows
are possible in the schema of Figure 1.
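The assembly of a query from a focus, a set of relations, value conditions and join conditions can be sketched as follows. This is an illustrative Python sketch only: the function names are hypothetical, the join conditions are supplied by the caller rather than derived from the schema, and the foreign-key targets used in the example (Movie.title, Theater.name, Studio.name) are assumptions.

```python
def sql_literal(value):
    """Render a Python value as an SQL literal (numbers bare, text quoted)."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def to_sql(focus, relations, conditions, joins=()):
    """Build a SELECT statement.

    focus:      "Rel.att" for an attribute focus, or "Rel" for a relation
    relations:  relation names for the FROM clause
    conditions: (attribute, value) pairs from the value elements
    joins:      (attribute, attribute) equality joins
    """
    select = focus if "." in focus else "*"
    where = [f"{left} = {right}" for left, right in joins]
    where += [f"{att} = {sql_literal(val)}" for att, val in conditions]
    sql = f"SELECT {select} FROM {', '.join(relations)}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql


# "What is the year of the film with name unforgiven?"
simple = to_sql("Movie.year", ["Movie"], [("Movie.title", "Unforgiven")])

# "What movies at the Westwood": two join paths in the cyclic schema.
westwood = [("Theater.name", "Westwood")]
via_shows = to_sql("Movie", ["Movie", "Shows", "Theater"], westwood,
                   joins=[("Shows.movie", "Movie.title"),
                          ("Shows.theater", "Theater.name")])
via_studio = to_sql("Movie", ["Movie", "Studio", "Theater"], westwood,
                    joins=[("Movie.studio", "Studio.name"),
                           ("Studio.premier", "Theater.name")])
```

The two queries for the Westwood question differ only in their join path, which is exactly the ambiguity that a cyclic schema introduces.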


3 Our Implementation 


Our Java-based open-source implementation¹
corresponds to the formal definition of
Section 2. Like PRECISE, $\zeta$ assignments are
computed via a brute force search and candidate
valid mappings are solved for via reduction to
graph max-flow. Candidate solutions are
filtered based on attachment relations obtained
from the Stanford Parser (De Marneffe et al.,
2006). We generate all possible SQL queries for
all valid mappings.
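To give a feel for the matching step, the sketch below pairs each phrase with a distinct database element using augmenting paths, which solves the unit-capacity special case of max-flow on a bipartite graph. It is a simplified stand-in written in Python, not the flow network used by PRECISE or by the Java implementation, and the function name `match_phrases` is hypothetical.

```python
def match_phrases(candidates):
    """Pair every phrase with a distinct element, or return None.

    candidates: dict mapping each phrase to the list of elements the
                lexicon allows for it.
    """
    owner = {}  # element -> phrase currently assigned to it

    def assign(phrase, seen):
        for element in candidates[phrase]:
            if element in seen:
                continue
            seen.add(element)
            # Take a free element, or re-route its current owner elsewhere.
            if element not in owner or assign(owner[element], seen):
                owner[element] = phrase
                return True
        return False

    for phrase in candidates:
        if not assign(phrase, set()):
            return None  # no one-to-one pairing covers all phrases
    return {phrase: element for element, phrase in owner.items()}


pairing = match_phrases({
    "name": ["Movie.title", "Theater.name"],
    "title": ["Movie.title"],
})
```

Here [name] is first given Movie.title and is then re-routed to Theater.name so that [title] can take Movie.title. When two phrases can only denote the same element, no pairing exists and the function returns None.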


4 Our Evaluation 

Like the earlier work, we evaluated our system
on Geoquery². Since very little information has
been disclosed regarding how PRECISE
purportedly handled superlatives (“What is the
most populous city in America?”), aggregation
(“What is the average population of cities in
Ohio?”), and negation (“Which states do not
border Kansas?”), we simply excluded these
types of questions from our evaluation. This
reduced our tests to 448 (of 880) Geoquery
questions.

In theory, PRECISE could be deployed
immediately on any relational database.
However, we found the automatic approach to
lexicon generation to be very erratic,
generating many irrelevant synonyms.
Part-of-speech (POS) tagging, which can help to
narrow down the senses of a word, is difficult
to determine automatically from database
element names. Even with the correct POS
identified, a word might have irrelevant senses
which muddle the lexicon. For example, WordNet
has 26 noun senses of the word point in the
Geoquery attribute highlow.lowest_point, one of
which has the synonym ‘state’. Hence we decided
to manually add mappings to the lexicon.
Another reason to do this was to map relevant
phrases which would not otherwise have been
generated automatically. For example, to
correctly answer the question “What major
rivers are in Texas?” the phrase [major river]
had to be associated with the relation river.

Out of these 448 questions, 162 were answered
correctly by our replication of PRECISE. This
does not accord with previously published
recall results (see Figure 3). On the positive
side, there were no questions for which PRECISE
returned a single wrong query.

¹ https://github.com/everling/PRECISE

² www.cs.utexas.edu/users/ml/geo.html















Figure 3: A negative replication result 



Figure 4: Sources of rejection 

Figure 4 breaks down the reasons why the 286
remaining questions were rejected by our
system: 94 questions contained no WH-word, 17
sentences contained non-stop words which the
lexicon did not recognize as part of any
phrase, 45 questions had at least one $\zeta$
but no $\phi_\zeta$ could be found that mapped
one-to-one and onto a set of elements, 41
questions had a $\phi_\zeta$ that was
one-to-one and onto, but no valid mapping could
be found, and 89 questions produced multiple
distinct solutions.
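As a consistency check on these counts, the rejection categories sum to the number of rejected questions, and together with the correctly answered questions they account for the whole test set:

```latex
\begin{align*}
94 + 17 + 45 + 41 + 89 &= 286 \\
162 + 286 &= 448 \\
\frac{162}{448} &\approx 0.3616
\end{align*}
```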

5 Discussion 

A natural question is, “did we faithfully
replicate PRECISE?” The description of PRECISE
was spread over two conference articles and a
couple of unpublished manuscripts. A
forthcoming journal article was referenced, but
unfortunately it does not seem to have been
published. Several aspects of PRECISE were
ambiguous, contradictory or incomplete and
forced us to make interpretations, which, if
wrong, could have an impact on recall. Still we
made every effort to boost evaluation results.
For example, in section 2.4 we removed
condition 4 from valid mappings and added the
underlined condition in 2. In section 2.2 we
added the additional stop words and WH-words to
boost recall. Finally we omitted certain
foreign keys from the lexicon to limit needless
ambiguity. We stand by the formalization
presented in section 2 as a reasonable
interpretation of PRECISE, although we are open
to correction.


While the recall results did not replicate, at
face value precision results do appear to hold
up; if one reads the questions under reasonable
interpretations, all the semantically tractable
questions map to what intuitively seems to be
the correct SQL. Still one must limit this
claim. Consider that there is only one valid
mapping for the question “what are the titles
of films directed by George Lucas?”; however, a
user may be disappointed if they expect the
database to also contain his student films.
Similar misconceptions could be present for
attributes and values. This aside, our way to
judge correctness is based on common sense,
assuming that the user fully understands the
context of the database. That said, the
semantically tractable class does not seem to
be fundamental. We have generalized the class
and nothing seems to block the extension of the
class to questions requiring aggregation,
superlatives, negation, self-joins, etc. Also,
the current semantically tractable class
excludes questions that seem simple (e.g.
“which films are showing in los angeles?” is
not semantically tractable). Future work is
needed to more cleanly define and limit
‘semantically tractable’.

An issue that complicates PRECISE is the role
of ambiguity. If the user asks “what are the
titles of the Clint Eastwood films?”, there are
several possibilities: 1. the films he
directed; 2. the films he acted in; 3. the
films he both acted in and directed; 4. the
films he either acted in or directed. Only 1
and 2 are expressible in PRECISE. Still, if
there were a paraphrasing capability, the user
could select their intended interpretation.
This leads to an immediate strategy to improve
practical ‘recall’. Another immediate idea is
to extend PRECISE to handle ellipsis of
WH-words.

A more serious issue is the hidden assumptions
PRECISE makes about the form of the schema.
Natural language interfaces do better when the
schema maintains a clear relation with a
conceptual model (e.g. Entity-Relationship
model). This is the case for the example we
developed, but it is not completely the case
for GEOQUERY, which contains tables such as
HighLow which have no real entity
correspondence. Not surprisingly, many of the
rejected questions in our evaluation involved
this conceptually suspect table. What is needed
is a more specific delineation of exactly what
schemas PRECISE is applicable over. We shall
investigate this theoretically as well as
empirically, investigating for example how well
PRECISE and its generalizations cover QALD
(Walter et al., 2012) and other corpora.

6 Conclusions 

Our replication of PRECISE made no errors in
terms of returning a single, incorrect query,
giving it the highest possible precision value.
However, out of the 448 questions given,
PRECISE was only able to produce SQL queries
for 162, giving it a recall value of 0.361.
Moreover our implementation of PRECISE requires
manual lexicon configuration. Still, even given
this ‘negative’ result, we feel that PRECISE is
a very appealing approach, but one that needs
more careful scrutiny, testing and
generalization. This is something we shall
continue to investigate.